perm filename 0[0,BGB]5 blob
sn#059728 filedate 1973-08-30 generic text, type T, neo UTF8
DRAFT THESIS OUTLINE. DECEMBER 1972
GEOMETRIC VISION
- draft thesis outline -
B. G. Baumgart
ABSTRACT:
This thesis is about a computer vision system based on a
geometric model of the objects being viewed. In principle, this
vision system is simply a process that can be applied to a reel of
video tape to compute blueprints or geodetic maps. Applications of
this system to object recognition, scene analysis and robot vehicle
control are demonstrated.
CONTENTS:
I. MEMORY.
A. Representation of a Geometric Mental Universe.
B. Contour-Region-Edge Image Representation.
C. Semantic, Feature and Predicate Representation.
II. PROCESS.
A. Image Prediction.
B. Image Perception.
C. Image Comparison.
D. Camera Locus Solution.
E. World Model Modification.
1. delete object from map.
2. add known object to map. (recognition).
3. add or alter object in dictionary.
III. APPLICATION.
A. Blocks and Block Scenes.
1. deletion of a block from a scene.
2. addition of blocks to a scene.
B. Tools and Table Top Scenes.
1. complicated object perception.
2. known object recognition.
C. A Robot Vehicle and Outdoor Scenes.
1. known road servoing.
2. landscape perception.
I. MEMORY STRUCTURE.
In order to get a computer to deal with the physical world it
must have a data representation on which computations involving
space, time, shape, size and the appearance of things can be done. In
this section, a representation for the topology, geometry and
photometry of everyday things is explained. The data
structures discussed are implemented as small blocks of words
containing pointers and data, in the fashion usual to graphics and
simulation; an introduction to this technology can be found in Knuth
[1]. Although the language of implementation is PDP-10 machine
code, the data and functions presented below are accessible from
higher level languages like LISP and ALGOL.
I.A. Representation of a Geometric Mental Universe.
At the top of the data structure is a single universe node
from which everything else can be reached. Immediately below the
universe node is a ring of world models. A robot dealing with
physical world sensor input, such as video data, has one of its world
models dedicated to simulating the immediate here and now; this
mental world is called the reality world model. In addition to the
reality world, a robot may have fantasy world models for problem
solving, planning or for recalling platonic object prototypes. In the
following, a two-world mental universe will be the most common, with
the reality world being referred to as a "map" and the fantasy world
being referred to as a "dictionary".
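
The arrangement just described, one universe node above a ring of
world models, can be sketched as follows. This is only an illustrative
sketch in Python, not the PDP-10 implementation; the class names
`Universe` and `World` and the methods `add_world` and `worlds` are
assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(eq=False)
class World:
    """One world model; `next` links the worlds into a ring (hypothetical layout)."""
    name: str
    next: Optional["World"] = None


@dataclass(eq=False)
class Universe:
    """Single root node from which every world model can be reached."""
    first: Optional[World] = None

    def add_world(self, name: str) -> World:
        world = World(name)
        if self.first is None:
            world.next = world          # a ring of one points at itself
            self.first = world
        else:
            # Find the last world in the ring and splice the new one after it.
            last = self.first
            while last.next is not self.first:
                last = last.next
            last.next = world
            world.next = self.first
        return world

    def worlds(self) -> Iterator[World]:
        """Walk the ring exactly once, starting from the first world."""
        node = self.first
        while node is not None:
            yield node
            node = node.next
            if node is self.first:
                break


universe = Universe()
universe.add_world("map")          # reality world model
universe.add_world("dictionary")   # fantasy world of object prototypes
names = [w.name for w in universe.worlds()]
```

A ring has no distinguished end, so the traversal stops when it comes
back to the world it started from.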
Geometric world models have four basic kinds of nodes:
body, face, edge and vertex. The face, edge and vertex nodes are used
to form polyhedrons which may be attached to body nodes. Body nodes
in turn are connected to each other in rings and trees to form a
world model. Additional kinds of nodes describe cameras and light
sources as well as temporary data such as shadows, spines, and
trajectories.
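
As a rough illustration of the four node kinds, the sketch below builds
a tetrahedron from vertex, edge and face nodes, attaches it to a body,
and hangs that body under another body in a tree. It is a simplified
sketch in Python under stated assumptions: the field names, the
`attach` method and the `make_tetrahedron` helper are hypothetical,
ordinary Python lists stand in for the rings, and this is not the
winged edge representation of AIM-179.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Vertex:
    position: Tuple[float, float, float]


@dataclass(eq=False)
class Edge:
    v1: Vertex
    v2: Vertex


@dataclass(eq=False)
class Face:
    edges: List[Edge] = field(default_factory=list)


@dataclass(eq=False)
class Body:
    name: str
    faces: List[Face] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)
    vertices: List[Vertex] = field(default_factory=list)
    parent: Optional["Body"] = None                     # tree link upward
    parts: List["Body"] = field(default_factory=list)   # tree links downward

    def attach(self, part: "Body") -> None:
        """Make `part` a sub-body of this body."""
        part.parent = self
        self.parts.append(part)


def make_tetrahedron(name: str) -> Body:
    corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    verts = [Vertex(p) for p in corners]
    # One edge per unordered pair of vertices: 6 edges.
    edges = {(i, j): Edge(verts[i], verts[j])
             for i in range(4) for j in range(i + 1, 4)}
    # One triangular face per omitted vertex: 4 faces.
    faces = []
    for omit in range(4):
        keep = [i for i in range(4) if i != omit]
        faces.append(Face([edges[(keep[a], keep[b])]
                           for a in range(3) for b in range(a + 1, 3)]))
    return Body(name, faces, list(edges.values()), verts)


table = Body("table")
block = make_tetrahedron("block")
table.attach(block)
# Euler's formula for a simple polyhedron: V - E + F = 2.
euler = len(block.vertices) - len(block.edges) + len(block.faces)
```

Checking Euler's formula is a cheap sanity test that the face, edge and
vertex nodes of a body really close up into a polyhedron.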
...continuation of this section follows AIM-179,
"Winged Edge Polyhedron Representation" - Baumgart.
I.B. Region-Edge Image Representation.
The image data structure presented in this section is a
computer's internal notation for what is vulgarly called a line
drawing; the common term is misleading because it does not suggest
the equally important space between the lines; terms closer to the
idea would be "mosaic drawing" or "stained glass window drawing".
The data structure has four main levels: TV raster, video
intensity contour, arc contour, and region-edge.
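
The "mosaic" idea, in which the space between the lines matters as much
as the lines, can be illustrated by letting every edge record the two
regions it separates. The sketch below is a hypothetical illustration
in Python, not the image structure of the thesis; the names `Region`,
`ImageEdge` and `neighbours` are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False)
class Region:
    label: str
    edges: List["ImageEdge"] = field(default_factory=list)


@dataclass(eq=False)
class ImageEdge:
    """A boundary segment lying between exactly two regions."""
    left: Region
    right: Region

    def __post_init__(self) -> None:
        # Register the edge with both regions it borders.
        self.left.edges.append(self)
        self.right.edges.append(self)


def neighbours(region: Region) -> Set[str]:
    """Labels of the regions sharing at least one edge with `region`."""
    found = set()
    for edge in region.edges:
        other = edge.right if edge.left is region else edge.left
        found.add(other.label)
    return found


background = Region("background")
square = Region("square")
shadow = Region("shadow")
# Three sides of the square border the background, one borders its shadow.
boundary = [ImageEdge(square, background) for _ in range(3)]
boundary.append(ImageEdge(square, shadow))
ImageEdge(shadow, background)
```

Because regions are stored as nodes in their own right, a question such
as "what lies next to the square" is answered by following pointers
rather than by searching the lines.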
...continuation of this section follows SAILON-71,
II. PROCESS.
A. Image Prediction.
B. Image Perception.
C. Image Comparison.
D. Camera Locus Solution.
E. World Model Modification.
1. delete object from map.
2. add known object to map. (recognition).
3. add or alter object in dictionary.
III. APPLICATION.
A. Block Scenes.
1. deletion of a block from a scene.
2. addition of blocks to a scene.
B. Tools and things.
1. complicated object perception.
2. known object recognition.
C. Robot Vehicle.
1. known road servoing.
2. landscape perception.